Uber Pickup Analysis of Bengaluru City

library(pickupanalysis)

Introduction

The goal of this project is to learn ggplot2 visualisation for a spatial dataset and to deliver basic analytics for the same using R language. The motivation for working with this project is to get a holistic understanding of the usage of R language in analytics domain and also to develop a familiarity with ggplot2 visualisations. Analysing Uber’s pickup data for Bengaluru City using plots and maps provide a better understanding of the large dataset, compared to viewing data in the dataframe.

Dataset Description

Two datasets are available with this package:-

  1. “boundaries” : This dataset that contains the names and geometry of areas in Bangalore City dataset. This dataset is downloaded from the uber movement dataset that is available online. The dataset is slightly modified to remove redundant columns. Source: movement.uber.com

  2. “date_hour” : This dataset provides the number of trips for the first quarter of 2020 for the city Bengaluru, India. This dataset is slightly modified for the purpose of effective storage into this package.

Package dependencies for this package “pickupanalysis”


library(jsonlite)   #Simple and fast json parser and generator for R
library(sf)    #Support for simple features, a standardized way to encode spatial vector data
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1; sf_use_s2() is TRUE
library(tidyverse)   #collection of open source packages for the R programming language
#> -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
#> v ggplot2 3.3.6     v purrr   0.3.4
#> v tibble  3.1.7     v dplyr   1.0.9
#> v tidyr   1.2.0     v stringr 1.4.0
#> v readr   2.1.2     v forcats 0.5.1
#> -- Conflicts ------------------------------------------ tidyverse_conflicts() --
#> x dplyr::filter()  masks stats::filter()
#> x purrr::flatten() masks jsonlite::flatten()
#> x dplyr::lag()     masks stats::lag()
library(dplyr)    #prominant data manipulation library
library(tidyr)    #tidies messy data
library(ggplot2)    #provides support for plots and visualisation
library(ggthemes)    #provides themes for ggplots
library(lubridate)   #helps wrangling with date, time format
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(DT)     #helps project dataframes as html friendly datatables
library(scales)    #automatically determines aesthetics for plots and maps
#> 
#> Attaching package: 'scales'
#> The following object is masked from 'package:purrr':
#> 
#>     discard
#> The following object is masked from 'package:readr':
#> 
#>     col_factor
library(mapview)    #provides create interactive visualisations of spatial data
library(ggpubr)    #To create publication ready plots
library(viridis)    #provides color scales for ggplots
#> Loading required package: viridisLite
#> 
#> Attaching package: 'viridis'
#> The following object is masked from 'package:scales':
#> 
#>     viridis_pal

Loads the boundaries dataset and builds additional columns into the dataset for analysis purpose.


data(boundaries)
sf::sf_use_s2(FALSE) #Handles spherical geometry error
#> Spherical geometry (s2) switched off
boundaries$centroid <- st_centroid(boundaries$geometry)
#> Warning in st_centroid.sfc(boundaries$geometry): st_centroid does not give
#> correct centroids for longitude/latitude data
dim(boundaries)
#> [1] 198   4

Mapview Bangalore boundaries dataset.


mapview(boundaries)
boundaries
5 km
3 mi
Leaflet | © OpenStreetMap contributors © CARTO

Convert centroid sf object to lat and long into the boundaries dataset.


lat_long_centroid <- boundaries$centroid
class(lat_long_centroid)
#> [1] "sfc_POINT" "sfc"
lat_long_mat <- st_coordinates(lat_long_centroid)
lat_long_tab <- as.data.frame(lat_long_mat)
boundaries$long <- lat_long_tab[,1]
boundaries$lat <- lat_long_tab[,2]
head(boundaries)
#> Simple feature collection with 6 features and 4 fields
#> Active geometry column: geometry
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 77.53672 ymin: 12.9849 xmax: 77.76003 ymax: 13.14367
#> Geodetic CRS:  WGS 84
#> # A tibble: 6 x 6
#>   MOVEMENT_ID WARD_NAME                       geometry            centroid
#>   <chr>       <chr>                 <MULTIPOLYGON [°]>         <POINT [°]>
#> 1 1           Chowdeswari W~ (((77.59229 13.0972, 77.~ (77.58042 13.12171)
#> 2 2           Atturu         (((77.56862 13.12705, 77~  (77.56004 13.1028)
#> 3 3           Yelahanka Sat~ (((77.59094 13.09842, 77~ (77.58393 13.09099)
#> 4 4           Vijnanapura    (((77.67683 13.01147, 77~ (77.66957 13.00606)
#> 5 5           Basavanapura   (((77.72899 13.02061, 77~ (77.71546 13.01685)
#> 6 6           Hudi           (((77.73583 13.01279, 77~ (77.70549 13.02238)
#> # ... with 2 more variables: long <dbl>, lat <dbl>

Dataset acquiring from our package.

data(date_hour)
head(date_hour)
#>   sourceid dstid month day start_hour end_hour
#> 1      102    97     3  13         10       16
#> 2       98    55     2  11          0        7
#> 3      148   111     1  11         16       19
#> 4       58    22     2   8         16       19
#> 5       54    62     1  18         16       19
#> 6      143   161     2   1         16       19

Data wrangling for the purpose of user friendly purpose for our analysis. By looking at the dataset, the trips are categorised into five hourslot classes. To best understand it, the hour slots are converted into human readable hour slots.

date_hour$hour_add <- rowSums(date_hour[, c("start_hour", "end_hour")])
s <- unique(sort(date_hour$hour_add))
date_hour$Hour_slots <- as.factor(ifelse(date_hour$hour_add == s[1], 'Midnight to Early Morning',
                     ifelse(date_hour$hour_add == s[2], 'Morning Peak',
                     ifelse(date_hour$hour_add == s[3], 'After sunset',
                     ifelse(date_hour$hour_add == s[4], 'Mid-Day', 'Evening Peak')))))
date_hour$Hour_slots <- factor(date_hour$Hour_slots, levels=c("Midnight to Early Morning", "Morning Peak", "Mid-Day", "Evening Peak", "After sunset"))

Converting the day and month names human readable.

m <- unique(sort(date_hour$month))
date_hour$month_names <- as.ordered(ifelse(date_hour$month == m[1], 'January',
                                   ifelse(date_hour$month == m[2], 'Febraury', 'March')))
date_hour$month_names <- factor(date_hour$month_names, levels=c("January", "Febraury", "March"))
date_hour$year <- as.numeric(2022)
date_hour$date_format <-as.Date(with(date_hour,paste(year,month,day,sep="-")),"%Y-%m-%d")
date_hour$day_of_week <- factor(wday(date_hour$date_format, label=TRUE))
date_hour <- date_hour[complete.cases(date_hour), ]
dim(date_hour)
#> [1] 7545777      12
head(date_hour)
#>   sourceid dstid month day start_hour end_hour hour_add
#> 1      102    97     3  13         10       16       26
#> 2       98    55     2  11          0        7        7
#> 3      148   111     1  11         16       19       35
#> 4       58    22     2   8         16       19       35
#> 5       54    62     1  18         16       19       35
#> 6      143   161     2   1         16       19       35
#>                  Hour_slots month_names year date_format day_of_week
#> 1                   Mid-Day       March 2022  2022-03-13         Sun
#> 2 Midnight to Early Morning    Febraury 2022  2022-02-11         Fri
#> 3              Evening Peak     January 2022  2022-01-11         Tue
#> 4              Evening Peak    Febraury 2022  2022-02-08         Tue
#> 5              Evening Peak     January 2022  2022-01-18         Tue
#> 6              Evening Peak    Febraury 2022  2022-02-01         Tue

Splitting the dataset into three months of the first quarter.


jan_ban <- date_hour[date_hour$month == 1, ]
feb_ban <- date_hour[date_hour$month == 2, ]
mar_ban <- date_hour[date_hour$month ==3, ]

str(jan_ban)
#> 'data.frame':    2935699 obs. of  12 variables:
#>  $ sourceid   : int  148 54 59 95 140 99 52 94 87 152 ...
#>  $ dstid      : int  111 62 12 73 191 33 82 83 25 117 ...
#>  $ month      : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ day        : int  11 18 18 31 11 31 18 31 26 29 ...
#>  $ start_hour : int  16 16 16 19 16 19 16 19 10 10 ...
#>  $ end_hour   : int  19 19 19 0 19 0 19 0 16 16 ...
#>  $ hour_add   : num  35 35 35 19 35 19 35 19 26 26 ...
#>  $ Hour_slots : Factor w/ 5 levels "Midnight to Early Morning",..: 4 4 4 5 4 5 4 5 3 3 ...
#>  $ month_names: Ord.factor w/ 3 levels "January"<"Febraury"<..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ year       : num  2022 2022 2022 2022 2022 ...
#>  $ date_format: Date, format: "2022-01-11" "2022-01-18" ...
#>  $ day_of_week: Ord.factor w/ 7 levels "Sun"<"Mon"<"Tue"<..: 3 3 3 2 3 2 3 2 4 7 ...

Plotting the number of trips for each months by day of the month.

daily_data_jan <- jan_ban %>% group_by(day) %>% dplyr::summarize(Total = n())
jan <- ggplot(daily_data_jan, aes(day, Total)) + 
geom_bar(stat="identity", 
         fill="steelblue", 
         color="red") + 
ggtitle("Trips Every Day in January", subtitle = "January days") + 
theme(legend.position = "none", 
      plot.title = element_text(hjust = 0.5), 
      plot.subtitle = element_text(hjust = 0.5)) + 
scale_y_continuous(labels=comma)

daily_data_feb <- feb_ban %>% group_by(day) %>% dplyr::summarize(Total = n())

feb <- ggplot(daily_data_feb, aes(day, Total)) + 
geom_bar(stat="identity", 
         fill="steelblue", 
         color="red") + 
ggtitle("Trips Every Day in Febraury", subtitle = "Febraury Days") + 
theme(legend.position = "none", 
      plot.title = element_text(hjust = 0.5), 
      plot.subtitle = element_text(hjust = 0.5)) + 
scale_y_continuous(labels=comma)

daily_data_mar <- mar_ban %>% group_by(day) %>% dplyr::summarize(Total = n())

mar <- ggplot(daily_data_mar, aes(day, Total)) + 
geom_bar(stat="identity", 
         fill="steelblue", 
         color="red") + 
ggtitle("Trips Every Day in March", subtitle = "March Days") + 
theme(legend.position = "none", 
      plot.title = element_text(hjust = 0.5), 
      plot.subtitle = element_text(hjust = 0.5)) + 
scale_y_continuous(labels=comma)

ggarrange(jan, feb, mar + rremove("x.text"), 
          ncol = 2, nrow = 2)

Plot1: Total number of trips for each months in the first quarter

Total trips by hour slots and months.

doubleplot <- date_hour %>% group_by(month_names, Hour_slots) %>%  dplyr::summarize(Total = n())
#> `summarise()` has grouped output by 'month_names'. You can override using the
#> `.groups` argument.

ggplot(doubleplot, aes(Hour_slots, Total, fill=month_names)) + 
geom_bar(stat = "identity") + 
ggtitle("Trips by Hour Slots and Months") + 
scale_y_continuous(labels = comma)

Plot 2: Total number of trips by hourslots and months

Data table that shows correlation between day of the month and hour slots

day_hour <- date_hour %>% group_by(day, Hour_slots) %>% dplyr::summarize(Total = n())
#> `summarise()` has grouped output by 'day'. You can override using the `.groups`
#> argument.
datatable(day_hour)

Data table that shows correlation between day of the month and hour slots

Heat Map by hourslots and day of month

ggplot(day_hour, aes(day, Hour_slots, fill = Total)) + 
geom_tile(color = "white") + 
ggtitle("Heat Map by Hour and Day")

Plot 3: Heat Map by hourslots and day of month

Total number of trips by month


month_levels <- date_hour %>% group_by(month_names) %>% dplyr::summarize(Total = n())

ggplot(month_levels, aes(month_names, Total, fill = month_names)) + 
geom_bar(stat = "Identity") + 
ggtitle("Trips in a month") + 
theme(legend.position = "none") + 
scale_y_continuous(labels = comma)

Plot 4: Total number of trips by month

Total number of trips by day of week and month


day_month_data <- date_hour %>% group_by(day_of_week, month_names) %>% dplyr::summarize(Trips = n())
#> `summarise()` has grouped output by 'day_of_week'. You can override using the
#> `.groups` argument.
ggplot(day_month_data, aes(day_of_week, month_names, fill = Trips)) + 
geom_tile(color = "white") + 
ggtitle("Heat Map by Month and Day")

Plot 5: Total number of trips by day of week and month

Trips by day of week and month.


ggplot(day_month_data, aes(day_of_week, Trips, fill = month_names)) + 
geom_bar(stat = "identity", aes(fill = month_names), position = "dodge") + 
ggtitle("Trips by Day and Month") + 
scale_y_continuous(labels = comma) +
coord_cartesian(ylim=c(150000,500000))

Plot 6: Trips by day of week and month

Distribution of trips by day of the week.


ggplot(day_month_data, aes(day_of_week, Trips, fill = day_of_week)) + 
  geom_boxplot() +
  scale_y_log10() +
  ggtitle("Distribution of Trips by Weekday") +
  theme(legend.position = "top") +
  coord_flip()

Plot 7: Distribution of trips by day of the week

Understanding the nature of dataset by viewing the head of it.

head(date_hour)
#>   sourceid dstid month day start_hour end_hour hour_add
#> 1      102    97     3  13         10       16       26
#> 2       98    55     2  11          0        7        7
#> 3      148   111     1  11         16       19       35
#> 4       58    22     2   8         16       19       35
#> 5       54    62     1  18         16       19       35
#> 6      143   161     2   1         16       19       35
#>                  Hour_slots month_names year date_format day_of_week
#> 1                   Mid-Day       March 2022  2022-03-13         Sun
#> 2 Midnight to Early Morning    Febraury 2022  2022-02-11         Fri
#> 3              Evening Peak     January 2022  2022-01-11         Tue
#> 4              Evening Peak    Febraury 2022  2022-02-08         Tue
#> 5              Evening Peak     January 2022  2022-01-18         Tue
#> 6              Evening Peak    Febraury 2022  2022-02-01         Tue

Data Wrangling for further analysis


boundaries$MOVEMENT_ID <- as.integer(boundaries$MOVEMENT_ID)

date_hour <- left_join(date_hour, boundaries %>% dplyr::select(WARD_NAME, MOVEMENT_ID), by = c('sourceid' = 'MOVEMENT_ID'))

date_hour <- date_hour %>% rename(pickup_area = WARD_NAME)

date_hour <- left_join(date_hour, boundaries %>% dplyr::select(WARD_NAME, MOVEMENT_ID), by = c('dstid' = 'MOVEMENT_ID'))

date_hour <- date_hour %>% rename(drop_area = WARD_NAME)

Extracting the Total number trips from different pickup areas of Bangalore


wardname_pickups <- date_hour %>% 
    group_by(pickup_area) %>% 
    tally() %>% 
    arrange(desc(n))
datatable(wardname_pickups)

Extracting information from dataset in the form of datatable by grouping month, pickup area and month names and day of month


by_month_ward_name <- date_hour %>% 
    group_by(month, pickup_area, month_names, day) %>% 
    tally() %>% 
    arrange(pickup_area, month)
datatable(by_month_ward_name)

Extracting day of week and pickup area.


ward_by_wkday <-  date_hour%>% 
    group_by(day_of_week, pickup_area) %>%
    tally() %>% 
    arrange(pickup_area, day_of_week)
datatable(ward_by_wkday)

Extracting day of week, pickup area and hour slots from the dataset.


wkday_ward_hslots <- date_hour %>% 
    group_by(day_of_week, pickup_area, Hour_slots) %>%
    tally() %>% 
    arrange(pickup_area, Hour_slots , day_of_week)
datatable(wkday_ward_hslots)

Plotting Uber Pickup Trend over the three months span from Subhash Nagar, Kadugodi, Begur, Kempegowda Ward, Kengeri and Koramangala.


plot_usage_growth <- by_month_ward_name %>% 
    filter(pickup_area %in% c("Subhash Nagar","Kadugodi","Begur","Kempegowda Ward", "Kengeri", "Koramangala")) %>%
    mutate(Date = paste(day, month, "2022", sep = "-")) %>% 
    mutate(Date = dmy(Date)) %>% 
    ggplot(aes(Date, n, colour = pickup_area)) + 
    geom_line() +
    theme_bw() +
    ggtitle("Uber Pick-Ups Growth")
plot_usage_growth    

Plot 8: Uber Pickup Trend for five wards in Bengaluru

Uber pickups by day of week, pickup_area and hour slots from Subhash Nagar, Kadugodi, Begur, Kempegowda Ward, Kengeri and Koramangala.


plot_all_pickup <- wkday_ward_hslots %>% 
    filter(pickup_area %in% c("Subhash Nagar","Kadugodi","Begur","Kempegowda Ward", "Kengeri", "Koramangala")) %>% 
    ggplot(aes(day_of_week, Hour_slots)) +
    geom_point(aes(size = n, colour = n)) + 
    theme_bw() +
    theme(legend.position = "none") +
    ggtitle("Pick-ups by Day, Time and Pickup Area") +
    ylab("Time") +
    facet_grid(pickup_area ~ .) +
    scale_colour_gradient(low = "lightsteelblue", high= "midnightblue")
plot_all_pickup

Plot 9: Uber pickups by week, slots and pickup area from five wards in Bengaluru

Pickups by day of week, hourslots from Koramangala.


plot_Koramangala <- wkday_ward_hslots %>% 
    filter(pickup_area %in% c("Koramangala")) %>% 
    ggplot(aes(day_of_week, Hour_slots)) +
    geom_point(aes(size = n, colour = n)) + 
    theme_bw() +
    theme(legend.position = "none") +
    ggtitle("Pick-ups by Day, Time in Koramangala") +
    ylab("Time") +
    scale_colour_gradient(low = "lightsteelblue", high= "midnightblue")
plot_Koramangala

Plot 10: Pickups by day of week, hourslots from Koramangala

Uber pickups heatmap by day of week and hour slots from Koramangala region.


heatmap_Koramangala <- wkday_ward_hslots  %>% 
    filter(pickup_area %in% c("Koramangala")) %>%
    ggplot(aes(day_of_week, Hour_slots, fill = n/3)) +
    geom_tile(color="white", size=0.1) +
    scale_fill_viridis(name="# Events/hour") +
    coord_equal() +
    labs(x=NULL, y= "Time", title="Pick-Ups per weekday & time of day in Koramangala") 
heatmap_Koramangala

Plot 11: Heatmap by day of week and hour slots from Koramangala region

Generating similar heatmap with other areas namely, Kadugodi, Begur along with Koramangala.


heatmap_other_pickuparea <- wkday_ward_hslots  %>% 
    filter(pickup_area %in% c("Kadugodi","Begur", "Koramangala")) %>% 
    ggplot(aes(day_of_week, Hour_slots, fill = n/3)) +
    geom_tile(color="white", size=0.1) +
    scale_fill_viridis(name="# Events/hour") +
    coord_equal() +
    labs(x=NULL, y= "Time", title="Pick-Ups per weekday & time of day other pick-up area") +
    facet_grid(. ~ pickup_area)
heatmap_other_pickuparea

Plot 12: Similar heatmap with other areas namely, Kadugodi, Begur along with Koramangala

Centroids of polygon boundaries to understand the nature of city.


a <- ggplot(boundaries, aes(x = long, y = lat)) + 
  geom_point() + 
  coord_equal() + 
  xlab('Longitude') + 
  ylab('Latitude')

b <- ggplot(boundaries, aes(x = long, y = lat)) + 
  coord_equal() + 
  xlab('Longitude') + 
  ylab('Latitude') + 
  stat_density2d(aes(fill = ..level..), alpha = .5,
                 geom = "polygon", data = boundaries) + 
  scale_fill_viridis_c() + 
  theme(legend.position = 'none')

ggarrange(a, b + rremove("x.text"), 
          ncol = 2, nrow = 1)

Plot 13: Centroid of the city

Creating a dataset boundaries_geom only with the geometry and movement_id for ease of referencing. Applying st_union on the bangalore boundaries dataset for better displying of the the bangalore are holistically under the plot. Creating a subset of boundaries dataset for further plotting.


boundaries_geom <- boundaries %>% dplyr::select(MOVEMENT_ID, geometry)
bangalore_union <- st_union(boundaries_geom)
#> although coordinates are longitude/latitude, st_union assumes that they are planar
boundaries_geom_p <- boundaries %>% dplyr::select(MOVEMENT_ID, long, lat, WARD_NAME)

Mapping the total number of pickups with the wardname for the first three months.


wardname_pickups_map <- date_hour %>% 
    group_by(pickup_area, sourceid) %>% 
    tally() %>% 
    arrange(desc(n))
datatable(wardname_pickups_map)

Plot 14: Total number of pickups with the wardname for the first three months

wardname_pickups_map <- as.data.frame(wardname_pickups_map)
wardname_pickups_map <- wardname_pickups_map %>% rename(Total_Pickups = n)
wardname_pickups_map <- merge(x=wardname_pickups_map, y=boundaries_geom, by.x='sourceid', by.y='MOVEMENT_ID')
wardname_pickups_map <- st_as_sf(wardname_pickups_map)
mapviewOptions(fgb = FALSE)
wardname_pickups_map %>% mapview(zcol = "Total_Pickups", legend = TRUE, col.regions = sf.colors)
. - Total_Pickups
10,00020,00030,00040,00050,00060,00070,000

5 km
3 mi
Leaflet | © OpenStreetMap contributors © CARTO
wardname_pickups_map %>% select(Total_Pickups) %>% plot(graticule = TRUE, axes = TRUE)

Plot 14: Total number of pickups with the wardname for the first three months

Bubble Plots for the number of pickups in the first quarter of 2020 in Bengaluru City.


ggplot() +
  geom_sf(data = bangalore_union, fill="grey", alpha=0.3) +
  geom_point(data=boundaries_geom_p, aes(x=long, y=lat, size=wardname_pickups_map$Total_Pickups, color=wardname_pickups_map$Total_Pickups)) +
  scale_size_continuous(range=c(1,12)) +
  scale_color_viridis(trans="log") +
  theme_void()+ coord_sf(xlim = c(77.45, 77.80), ylim = c(12.80, 13.15), expand = FALSE)

Plot 15: Bubble Plots for the number of pickups in the first quarter of 2020

Observation and Analysis from the above plots:-

Plot 1: Although definitive pattern is not observed with this plot, total number of trips by day in March shows a significant decline. This can be attributed to the first instance of covid19 situation in Bengaluru. It can also be observed that, there have been uber trips even after the declaration of Nationwide lockdown, although drastically reduced from normal.

Plot 2: Bar plot provides a clear idea that the demand for uber cars and autos is higher in the mid-day hours compared to the morning peak(7AM - 10AM) and evening peak hours(4PM - 7PM). To my surprise, there is also significant amount of demand in the midnight to early morning hour slots. Given the number of hours for each slots, highest demand for uber is spotted at the late after sunset, compared to evening and morning peak hours.

Plot 3: In general, there is a high demand in the mid day hours all through the three months

Plot 4: Total Trips by month plot shows clear decline in the number of trips over three months, this can be partly attributed to the covid situation.

Plot 5: On examining the heat map, although by general trend, highest demand is seen in the weekend. There is also equally similar demands in Mondays for the three months

Plot 6: On viewing this plot, different picture emerges. Month of February and March shows high demand during Wednesday compared to the weekends and Mondays.

Plot 7: In this plot, Mondays show the highest of bandwidths of demand compared to the even the weekends

Plot 8: Trendlines provides demand by ward names over the three months. Subhash Nagar shows highest demand, Kadugodi shows the lowest demand. This code snippet can be used to find the trendlines for any location in Bengaluru. Drastic decline is spotted towards the end of March.

Plot 9: This plot, provides demand by hour slots and day of week for six wards in Bengaluru. Subhash Nagar shows demand in all segments, whereas Kadugodi shows low demand in all segments and also shows general trend as exhibited in other plots.

Plot 10: This plot exhibits demand in Koramangala area in Bengaluru. This plot provides clear picture of uber pickup demand. Generally increasing demand is observed, peaking after sunset in all days. Lowest demand is found over the Early mornings of Fridays, this observation is significant as the lowest demand is not over the mid days of the week, but on Fridays.

Plot 11: This plot shows correspondence to the previous plot

Plot 12: Provides visible changes in the demand on comparison of Koramangala’s demand with others.

Plot 13: Provides a brief idea of concentration of wards in the Bengaluru city

Plot 14: Provides map of demands of uber for three months of 2020 in Bengaluru. This map provides a nice idea of the demands by wards in Bengaluru.

Plot 15: This bubble plot provides the concentration of demand in and around the Bengaluru city.

Conclusion:-

These plots provides insights on demands in Bengalure City. To my surprise, different type of plots from the same dataset provides different insights. Although the scope of analysis is less as the dataset is only for the first quarter of 2020, this project can be a good starting point for larger analysis. Each and every plots can be scaled up for better insights.